Thilina's workspace
Runs (1)
Name
State                              Finished
Notes
User                               thilina
Tags
Created
Runtime                            10h 7m 3s
Sweep                              -
adafactor_clip_threshold           1
adafactor_decay_rate               -0.8
adafactor_eps                      [9.999999999999999e-31,0.001]
adafactor_relative_step            false
adafactor_scale_parameter          false
adafactor_warmup_init              false
adam_epsilon                       1.0000e-8
best_model_dir                     outputs/best_model
cache_dir                          cache_dir/
cosine_schedule_num_cycles         0.5
custom_layer_parameters            []
custom_parameter_groups            []
dataloader_num_workers             0
do_lower_case                      false
do_sample                          false
dynamic_quantize                   false
early_stopping                     true
early_stopping_consider_epochs     false
early_stopping_delta               0
early_stopping_metric              eval_loss
early_stopping_metric_minimize     true
early_stopping_patience            3
eval_batch_size                    64
evaluate_during_training           true
evaluate_during_training_silent    true
evaluate_during_training_steps     30000
evaluate_during_training_verbose   true
evaluate_each_epoch                true
evaluate_generated_text            false
fp16                               false
gradient_accumulation_steps        1
learning_rate                      0.001
length_penalty                     2
local_rank                         -1
logging_steps                      50
max_grad_norm                      1
max_length                         20
max_seq_length                     24
max_steps                          -1
model_class                        T5Model
model_name                         google/mt5-base
model_type                         mt5
multiprocessing_chunksize          500
n_gpu                              1
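
For reuse, the logged configuration can be written as a plain Python dictionary. The sketch below copies the values as they appear in the run table. The `build_model` helper is hypothetical and not part of the logged run: it assumes the run was produced with the Simple Transformers library (suggested by `model_class` being `T5Model`) and that the dictionary can be passed through the `args` parameter of `T5Model`. The training batch size, number of epochs, and optimizer are not among the captured columns, so they are left out rather than guessed.

```python
# Hyperparameters copied from the run table; values are as logged.
run_args = {
    "adafactor_clip_threshold": 1.0,
    "adafactor_decay_rate": -0.8,
    "adafactor_eps": [9.999999999999999e-31, 0.001],
    "adafactor_relative_step": False,
    "adafactor_scale_parameter": False,
    "adafactor_warmup_init": False,
    "adam_epsilon": 1e-8,
    "best_model_dir": "outputs/best_model",
    "cache_dir": "cache_dir/",
    "cosine_schedule_num_cycles": 0.5,
    "custom_layer_parameters": [],
    "custom_parameter_groups": [],
    "dataloader_num_workers": 0,
    "do_lower_case": False,
    "do_sample": False,
    "dynamic_quantize": False,
    "early_stopping": True,
    "early_stopping_consider_epochs": False,
    "early_stopping_delta": 0,
    "early_stopping_metric": "eval_loss",
    "early_stopping_metric_minimize": True,
    "early_stopping_patience": 3,
    "eval_batch_size": 64,
    "evaluate_during_training": True,
    "evaluate_during_training_silent": True,
    "evaluate_during_training_steps": 30000,
    "evaluate_during_training_verbose": True,
    "evaluate_each_epoch": True,
    "evaluate_generated_text": False,
    "fp16": False,
    "gradient_accumulation_steps": 1,
    "learning_rate": 0.001,
    "length_penalty": 2,
    "local_rank": -1,
    "logging_steps": 50,
    "max_grad_norm": 1,
    "max_length": 20,
    "max_seq_length": 24,
    "max_steps": -1,
    "model_class": "T5Model",
    "model_name": "google/mt5-base",
    "model_type": "mt5",
    "multiprocessing_chunksize": 500,
    "n_gpu": 1,
}

# Keys that identify the model rather than configure training.
MODEL_KEYS = ("model_class", "model_name", "model_type")


def build_model(args=run_args):
    """Hypothetical helper: rebuild the model, assuming Simple Transformers."""
    # Imported lazily so the dictionary can be inspected without the library.
    from simpletransformers.t5 import T5Model

    model_args = {k: v for k, v in args.items() if k not in MODEL_KEYS}
    return T5Model(args["model_type"], args["model_name"], args=model_args)
```

Calling `build_model()` would download `google/mt5-base`, so the function is defined here but not invoked.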